Modern Neural Networks Generalize on Small Data Sets

Olson, Matthew, Wyner, Abraham, Berk, Richard

Neural Information Processing Systems

In this paper, we use a linear program to empirically decompose fitted neural networks into ensembles of low-bias sub-networks. We show that these sub-networks are relatively uncorrelated, which leads to an internal regularization process, much like that of a random forest, and which can explain why a neural network is surprisingly resistant to overfitting. We then demonstrate this in practice by applying large neural networks, with hundreds of parameters per training observation, to a collection of 116 real-world data sets from the UCI Machine Learning Repository. These data sets contain far fewer training examples than the image classification tasks generally studied in the deep learning literature, and they also exhibit non-trivial label noise. We show that even in this setting deep neural networks are capable of achieving superior classification accuracy without overfitting.
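The decomposition and the random-forest analogy can be made concrete with a small sketch. The code below is a minimal, hedged illustration, not the authors' actual linear program: it fits an over-parameterized single-hidden-layer network to a small tabular data set, splits the hidden units into disjoint groups, solves a feasibility LP per group for output weights that classify every training point with margin at least 1, and then measures how correlated the resulting sub-network predictions are and how the averaged ensemble performs. The data set, architecture, number of groups, and the particular LP are all illustrative assumptions.

```python
# Illustrative sketch only: NOT the paper's exact decomposition.
import numpy as np
from scipy.optimize import linprog
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Small tabular data set, standing in for a UCI-style task (assumption).
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
y_pm = 2 * y - 1                                   # labels in {-1, +1}
X_tr, X_te, y_tr, y_te = train_test_split(X, y_pm, random_state=0)

# Deliberately over-parameterized relative to ~400 training points.
net = MLPClassifier(hidden_layer_sizes=(256,), max_iter=2000, random_state=0)
net.fit(X_tr, (y_tr + 1) // 2)

def hidden(Z):
    """Hidden-layer (ReLU) activations of the fitted network."""
    return np.maximum(0.0, Z @ net.coefs_[0] + net.intercepts_[0])

H_tr, H_te = hidden(X_tr), hidden(X_te)

# Split the hidden units into K disjoint groups, one per sub-network.
K = 8
rng = np.random.RandomState(0)
groups = np.array_split(rng.permutation(H_tr.shape[1]), K)

sub_preds = []
for g in groups:
    # Feasibility LP: find w supported on group g with y_i * (H_i[g] . w) >= 1
    # for every training point i, i.e. the sub-network fits the training data.
    # linprog minimizes, so use a zero objective and -y_i * H_i[g] . w <= -1.
    res = linprog(c=np.zeros(len(g)),
                  A_ub=-(y_tr[:, None] * H_tr[:, g]),
                  b_ub=-np.ones(len(y_tr)),
                  bounds=[(None, None)] * len(g),
                  method="highs")
    if res.success:
        sub_preds.append(np.sign(H_te[:, g] @ res.x))

sub_preds = np.array(sub_preds)
ensemble = np.sign(sub_preds.mean(axis=0))
print("feasible sub-networks:", len(sub_preds))
print("mean pairwise prediction correlation:", np.corrcoef(sub_preds).mean())
print("ensemble test accuracy:", (ensemble == y_te).mean())
```

If the sketch behaves like the paper's narrative suggests, the pairwise correlations stay well below 1 even though each sub-network separately fits the training data, and averaging them gives test accuracy comparable to the full network.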



Reviews: Modern Neural Networks Generalize on Small Data Sets

Neural Information Processing Systems

This paper presents an interesting idea, which is that deep neural networks are able to maintain reasonable generalization performance, even on relatively small datasets, because they can be viewed as an ensemble of uncorrelated sub-networks.

Quality: The decomposition method seems reasonable, except for the requirement that the model and the sub-networks achieve 100% training accuracy. While there are some datasets where this is reasonable (often high-dimensional datasets), there are others where such an approach would work very badly. That seems to me a fundamental weakness of the approach, especially if there are datasets of that nature where deep neural networks still perform reasonably well. For a random forest, we have an unweighted combination of base classifiers, whereas the decomposed sub-networks are combined with learned weights that are tuned on the training data.
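The reviewer's last point, the contrast between a random forest's unweighted vote and a combination whose weights are fit on the training data, can be made concrete with a small sketch. The toy data set, the forest, and the least-squares stacker below are assumptions for illustration only, not the paper's procedure.

```python
# Illustration of unweighted vs. learned combination of base classifiers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy data set and forest; both are arbitrary illustrative choices.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# Per-tree class-1 probabilities, shape (n_samples, n_trees).
P_tr = np.column_stack([t.predict_proba(X_tr)[:, 1] for t in rf.estimators_])
P_te = np.column_stack([t.predict_proba(X_te)[:, 1] for t in rf.estimators_])

# Unweighted combination: what the random forest itself does.
acc_equal = ((P_te.mean(axis=1) > 0.5) == y_te).mean()

# Learned combination: weights fit to the training data (here by least
# squares), which is the reviewer's concern about the decomposed sub-networks.
w, *_ = np.linalg.lstsq(P_tr, y_tr, rcond=None)
acc_learned = ((P_te @ w > 0.5) == y_te).mean()

print("unweighted vote accuracy:", acc_equal)
print("learned-weight accuracy:", acc_learned)
```

The structural difference is the point here: the learned weights introduce an extra fit to the training data, which is exactly the step the reviewer flags as a departure from the random-forest analogy.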

